perm filename DBL4.TEX[PEG,DBL]3 blob sn#506016 filedate 1980-04-22 generic text, type C, neo UTF8
\init{
\input cmpdes
\input ammacs
\def\draft{T}
\titlepage
\newpage
\setcount0 35
\parttitle{AM: DISCOVERY IN MATHEMATICS AS HEURISTIC SEARCH}
}
\baselineskip 12pt
\chapbegin{4}		% Here beginneth Chapter 4
\chaptertitle{HEURISTICS}
\rjustline{\:B Heuristics}
\runningrighthead{INTRODUCTION}
\tenpoint
\vskip 14pc

\noindent Assume that somehow {\AM} has selected a particular task from the
agenda---say, ``{\bf Fill-in  Examples of Primes}.''  What precisely does {\AM} do
in order to execute the  task? How {\sl are} examples of primes  filled
in?

The answer can be compactly stated as follows:

\ctrline{\bf {\AM} selects relevant heuristics, and executes them.}

\yskip

This really just splits our original  question into two new ones: ({\it i\/})
How are  the relevant heuristics selected, and ({\it ii\/}) What does it mean
for heuristics to be  executed (\eg, how does executing  a heuristic
rule help to fill in examples of primes?).


These two  topics (in  reverse order) are  the two major  subjects of
this chapter. Although several examples of heuristics will be  given,
the complete  list is relegated  to Appendix  \ffootnote{2.}{There
they are condensed and phrased in English.  The reader wishing to see
examples of the heuristics as they actually were coded in LISP should
glance at Appendix 3.2.}

The  first section  explains what  heuristic rules  look like  (their
``syntax,''  as it were).  The next three sections  illustrate how they
can be executed to achieve their desired results (their ``semantics'').

Section 4--5 explains  where the rules are  stored and how  they are
accessed at the appropriate times.


Finally, the initial body of heuristics is analyzed.
The informal knowledge they contain is categorized and described.
The distribution of heuristics among the concepts turned out, quite
unintentionally, to be nonhomogeneous;
this too is described in Section 4--6.

\sectionskip
\sectionbegin[1]{Syntax of the Heuristics}
Let's  start by  seeing  what a  heuristic rule  looks  like.   In
general (see [Davis & King 75] for historical references to production rules), 
it will have the form

{\parindent 45pt \parskip 1pt

If {\sl <situational fluent>}

Then {\sl <actions>}

} 

\parindent 20pt

As an  illustration, here is a heuristic  rule, relevant when checking examples
of anything:

\han2{{\it If the current task is to Check Examples of any concept X,}}

\han3{{\it and (for some Y) Y is a generalization of X,}}

\han3{{\it and Y has at least 10 examples,}}

\han3{{\it and all examples of Y are also  examples of X,}}


\han2{{\it Then print the following conjecture: \sl ``X is really no more specialized
than Y,''}}

\han3{{\it and add it to the Examples facet of the concept named ``Conjectures,''}}

\han3{{\it
and 
add the following task to the agenda:
{\bf ``Check  examples of Y''}, for the  reason:
``Just as Y was no more general  than X, one-of Generalizations(Y) may
turn out to be no more general than Y,'' with a rating for that reason
computed as the average of   $\|$Examples(Generalizations(Y))$\|$,
$\|$Examples(Y)$\|$, and Priority(Current task).}}

\yskip

\noindent As with production rules and formal grammatical rules,
each of {\AM}'s heuristic rules
has a left-hand side and a right-hand side.
On the left is a test to see whether the rule is applicable, and on the right
is a list of actions to take if the rule applies.
The left-hand side will also be called the IF-part, the predicate, the
preconditions, left side, or the situational fluent of the rule.
The right-hand side will sometimes be referred to as the THEN-part, the
response, the right side, or the actions part of the rule.
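The rule anatomy just described can be sketched in a few lines of code. This is a minimal, hypothetical rendering in Python ({\AM} itself was written in LISP, and the class and field names below are illustrative only): the left-hand side is a conjunction of predicates, the right-hand side a list of actions, and the rule ``triggers'' only when every conjunct returns True.

```python
# Hypothetical sketch of AM's rule anatomy (AM itself was LISP):
# LHS = conjoined predicates, RHS = a list of side-effecting actions.

class HeuristicRule:
    def __init__(self, conjuncts, actions):
        self.conjuncts = conjuncts  # IF-part: predicates returning True/False
        self.actions = actions      # THEN-part: actions, executed in order

    def triggers(self, situation):
        # The rule "triggers" only if every conjunct returns True.
        return all(p(situation) for p in self.conjuncts)

    def run(self, situation):
        # If the LHS triggers, execute each action in order.
        if not self.triggers(situation):
            return False
        for act in self.actions:
            act(situation)
        return True

# A toy rule: if the current task is to Check Examples of some concept,
# record a (hypothetical) conjecture about that concept.
log = []
rule = HeuristicRule(
    conjuncts=[lambda s: s["task"] == ("Check", "Examples", s["concept"])],
    actions=[lambda s: log.append("conjecture about " + s["concept"])],
)
rule.run({"task": ("Check", "Examples", "Primes"), "concept": "Primes"})
```

Note that the sketch runs the actions only after the whole conjunction succeeds, mirroring the fact that only right-hand sides are permitted side effects.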

\subsectionbegin[1.1]{Syntax of the Left-Hand Side}
The situational fluent (left-hand side)
is  a LISP predicate, a function  which always
returns True or False (in LISP, it actually returns either the atom T
or the  atom NIL).   This  predicate may  investigate  facets of  any
concept (often merely to see whether  they are empty or not), use the
results of recent tests and behaviors (\eg, to see how much cpu time
{\AM} spent trying to work on a certain task), etc.

The left side  is a conjunction  of the 
form  $P↓1 \meet P↓2  \meet \ldots$.  All  the
conjuncts, except the very  first one, are arbitrary LISP predicates.
They are constrained only to obey two commandments:

\listbegin
\hddlistentry[Be quick!]{(return either True or False in under 0.1 cpu seconds)}
\hddlistentry[Have no side effects!]{(no destroying or creating of list structures
or LISP functions, no resetting of global variables)}
\listend

Here are some sample  conjuncts that might appear inside  a left-hand
side (but {\sl not} as the very first conjunct):

{\it\listbegin
\bullistentry{More than half of the current  task's time quantum is already
exhausted,\ellipsis}

\bullistentry{There are some known examples of Structures,\ellipsis}

\bullistentry{Some  generalization  of  the current  concept  (the  concept
mentioned  as  part  of  the  current task)  has  an  empty  Examples
facet,\ellipsis}

\bullistentry{The space quantum  of the current task is  gone, but its time
allocation is less than 10{\rm\%} used up,\ellipsis}

\bullistentry{A task recently  selected had the form {\bf ``Restructure  facet F
of  concept  X''},  where F  is  any  facet,  and  X is  the  current
concept,\ellipsis}

\bullistentry{The user has used this system at least once before,\ellipsis}

\bullistentry{It's Tuesday,\ellipsis}

\listend}

The very first conjunct of each left-hand side is special. Its syntax
is highly constrained.   It specifies the domain of  applicability of
the rule, by  naming a particular facet of a particular concept to which
this rule is relevant.

{\AM} uses this first conjunct as a fast ``pre-precondition,'' so that the
only rules whose  left-hand sides get evaluated are  already known to
be  somewhat relevant  to the  task at  hand. In fact,  {\AM} physically
attaches each rule  to the facet and  concept mentioned in its  first
\ffootnote{conjunct.}{Sometimes,  I  will  mention  where a  certain  rule  is
attached; in that  case, I  can omit  explicit mention  of the  first
conjunct.  Conversely, if I include that conjunct, I needn't  tell you
where the rule is  stored.}\par
Here  are two typical  examples of allowable
first conjuncts:

{\it\listbegin

\bullistentry{The current task (the one last selected from the agenda) is of
the form {\bf ``Check the Domain/range facet of  concept X''}, where X is
any operation}

\bullistentry{The current task is of the  form {\bf ``Fillin the examples facet
of the Primes concept''}}

\listend}

These are the only guidelines which the left-hand side of a heuristic
rule  must   satisfy.  Any  LISP  predicate   which  satisfies  these
constraints  is a  syntactically  valid left-hand  side for  a heuristic
rule.
It turned out later that this excessive freedom made it difficult for
{\AM} to inspect  and analyze and synthesize its  own heuristics; such a
need was not foreseen at the time {\AM} was designed.

Because of this  freedom, there  is not much  more to  say about  the
left-hand sides of rules. As you  encounter heuristics in the
next  few sections,  you should  notice  the (unfortunate)  variety of
conjuncts which may occur as part of their left-hand sides.

\subsectionbegin[1.2]{Syntax of the Right-Hand Side}
``Running'' the left-hand side means evaluating the series of conjoined
little  predicates there, to see  if they all return  True. If so, we
say that the  rule ``triggers.'' In that  case, the right-hand side  is
``run,''  which means  executing all  the actions  specified there.   A
single  heuristic rule  may  have a  list of  several actions  as its
right-hand side.  The actions are executed in order, and  we then say
the rule has finished running.

Only  the right-hand side of  a heuristic  rule is permitted  to have
side effects.  The right  side of a rule is  a series of little  LISP
functions, each of which is called an {\sl action}.

Semantically,  each   action  performs   some  processing  which   is
appropriate  in some  way to  the  kinds of  situations in  which the
left-hand side would have triggered.  The final value that the action
function returns is irrelevant.

Syntactically, each  function or
``action''  must  satisfy only one  constraint:  it has exactly one  of  the  following three
side effects, and no others:

\listbegin
\bullistentry{It suggests a new task for the agenda.}
\bullistentry{It causes a new concept to be created.}
\bullistentry{It adds an entry to (or  deletes an entry from) a particular
facet  of a particular concept.}
\listend

\noindent To repeat: the right side  of a rule contains a list of actions, each
of which is one of  the above three types.  A single rule might  thus
result in the creation of several new  concepts, the addition of many
new tasks  to the agenda, and  the filling in of  some facets of some
already existing concepts.
These three kinds of actions  will now be discussed in the  following
three sections.

\sectionskip
\sectionbegin[2]{Heuristics Suggest New Tasks}
This section discusses the ``proposing a new task'' kind of action.
Here is the basic idea in a nutshell: The left-hand side of a rule triggers.
Scattered  among  the ``things  to do''  in its right-hand side are 
some suggestions  for future  tasks. These
new tasks are  then simply  added to  the agenda  list.

\subsectionbegin[2.1]{An Illustration: ``Fill in Generalizations of Equality''}
If a new task is  suggested by a heuristic rule, then  that rule must
specify how to assemble  the new task, how to get reasons for it, and
how to  evaluate  those reasons.   For  example,  here is  a  typical
heuristic rule which proposes a new task to add to the agenda. It says
to generalize a predicate if it is very \ffootnote{rarely}{The most
suspicious part of the situational fluent (the IF-part) is the 
number ``.05.'' Where
did it come from? Hint: if all humans had f fingers per hand,
h hands, t toes per foot, and F feet, this would probably be
1/(fh+tF). Seriously, one can change this value (to .01 or to .25)
with virtually no change in {\AM}'s behavior. This is the conclusion of
experiment 3 (see Section 6.2.3). Such empirical
justification is one
important reason for actually writing and running large programs like {\AM}.}
satisfied:

\vskip \listskip
\han2{{\it If the current task was (Fill-in examples of X),}}

\han3{{\it and X is a predicate,}}

\han3{{\it and more than 100 items are known in the domain of X,}}

\han3{{\it and at least 10 cpu seconds were spent trying to randomly instantiate X,}}

\han3{{\it and the ratio of successes/failures is both >0 and less than .05}}


\han2{{\it Then  add the following task to  the agenda: 
{\bf  (Fill-in generalizations of X)},}}

\han3{{\it for the following reason:}}

\han3{{\it ``X is rarely satisfied; a less restrictive  concept might be
more interesting.''}}

\han3{{\it This  reason's  rating  is  computed  as  three times
the  ratio  of non\-ex\-am\-ples/ex\-am\-ples found.}}
\vskip \listskip

\noindent Even this  is one  full step  above the  actual LISP  implementation,
where ``{\bf X is a predicate}''  would be coded as ``{\bf (MEMBER X (EXAMPLES
PREDICATE))}.''  The function  {\bf EXAMPLES(X)} rummages  about looking
for already-existing  examples of  X.
Also, the LISP code contains information for normalizing all the numbers
produced, so that they will lie in the range 0--1000.

Let's examine an instance of where this rule was used. At some point,
{\AM} chose the task ``{\bf Fillin examples of List-equality}.''  One of the
ways it filled in examples of this  predicate was to run it on  pairs
of randomly chosen lists, and observe whether the  result was True or
\ffootnote{False.}{The True ones became examples of List-equality, and the pairs
of  lists  which  didn't  satisfy  this  predicate  became  known  as
non-examples (failures,  foibles, etc.).  A heuristic similar  to this
``random  instantiation'' one is illustrated  in Section 4--3.8.}
Say that 244 random pairs of lists were tried, and only twice was
this predicate satisfied.  Sometime later, the IF part
of the above heuristic  is examined.  All the  conditions are  met, so  it
``triggers.'' 
For example, the ``ratio of successes to failures'' is just 2/242, which is
clearly greater than zero and less than 0.05.
So the  right-hand side (THEN-part) of the above rule is  executed. 
The right-hand side initiates only one action: the task
``{\bf Fillin generalizations of List-equality}'' is added to the agenda,
tagged with the reason ``List-equality is rarely satisfied; a slightly
less restrictive concept might be  more interesting,'' and that reason
is assigned a numeric rating of $3\times (242/2) = 363$.
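The arithmetic of this instance is easy to check. The sketch below simply replays it: 2 successes against 242 failures puts the ratio inside the (0, .05) triggering window, and the reason-rating formula of the rule, three times the ratio of nonexamples to examples, yields 363.

```python
# Replaying the List-equality instance: 244 random pairs of lists,
# only 2 of which satisfied the predicate.
successes, failures = 2, 242

ratio = successes / failures         # the IF-part's successes/failures ratio
assert 0 < ratio < 0.05              # so the rule triggers

# THEN-part reason rating: three times nonexamples/examples.
rating = int(3 * (failures / successes))
print(rating)  # 363
```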

Notice  that the heuristic  rule above supplied a  little function to
compute  the  value  of  the  reason.  That formula was:
``three  times   the  ratio  of non\-ex\-am\-ples/ex\-am\-ples
\footnote{found.''}{In actuality, this would be checked to ensure  that the
result lies between 0 and 1000.}  Functions  of this type, to compute the
rating  for  a   reason,  satisfy  the   same two  constraints  as   the
left-hand side did: the function  must be very fast and  it must have
no side effects. The ``intelligence'' that {\AM} exhibits in selecting which
task to work on  ultimately depends on the accuracy of these local rule
evaluation formulae. Each one is so specialized that it is ``easy'' for it to
give a valid result; the range of situations it must judge is quite narrow.
Note that these little formulae were handwritten, individually, by the
author. {\AM} wasn't able to create new little reason-rating formulae.

The  reason-rating function  is evaluated  at the  moment the  job is
suggested,  and only  the  numeric  result  is  remembered,  not  the
original function.  In other words, we  tack on a list of reasons and
associated  numbers,  for  each  job on  the  agenda.  The agenda {\sl doesn't}
maintain copies of the reason-rating functions which gave those  numbers. 
This simplification is used merely to save the system some space and time.

Let's turn now from the reason-rating formulae to the reasons themselves.
Each reason supporting  a newly suggested
job is simply an English sentence (an  opaque string, a token). 
{\AM} cannot do much intelligent processing on these reasons. 
{\AM} is  not
allowed to inspect parts of it, parse it, transform it, etc. The most
{\AM}  can do  is compare two  such tokens  for equality.  Of course, it
is not too hard to imagine this capability being extended to  permit {\AM}  to
syntactically analyze such strings, or to trivially compute some sort
of ``difference'' between two given reasons.\ffnote{It is in fact
trivial to {\sl imagine} it. Of course {\sl doing} it is quite a bit
less trivial. In fact, it probably is the toughest of all the ``open research
problems'' I'll propose.}  Each reason is assumed to have some semantic
impact on the user, and is kept around partly for that purpose.

Each reason for task $\tau$ has a numeric rating (a number between 0 and 1000)
assigned to it locally by the heuristic rule which proposed $\tau$
for that reason. One global formula  then combines
all the reasons' ratings into one single priority value for the task.

\subsectionbegin[2.2]{The Ratings Game}
In general, a  task on the agenda  list will have several  reasons in
support  of it.   Each  reason consists  of an  English phrase  and a
numeric rating.  How can a task have more than one reason?  There are
two  contributing  factors: ({\it i\/})  A  single  heuristic rule  can  have
several reasons in support of a job it suggests, and ({\it ii\/}) When a rule
suggests a ``new'' task, that  very same task may already exist  on the
agenda, with quite  distinct reasons tacked on there.   In that case,
the new reason(s) are added to the already-known ones.

One global formula  looks at  all the  ratings for  the reasons,  and
combines them into a  single priority value for the task  as a whole.
Below is that formula, in all its gory detail:
$$Worth(J) =  \|\sqrt{\Sigma R↓i↑2}\,\| \times \biglp .2\cdot Worth(A) + .3\cdot Worth(F) + .5\cdot Worth(C)\bigrp$$

\noindent (where J = job to be judged = (Act A, Facet F, Concept C)
and $\{R↓i\}$ are the ratings of the reasons supporting J).

\yskip

For example, consider  the job J = {\bf (Check examples of Primes)}.   The
act A would be  ``{\bf Check},'' which has a numeric worth of 100.  The facet
F would be ``{\bf Examples},'' which has a numeric worth of 700.  The concept
C would  be ``{\bf Primes},'' which  at the moment  might have a Worth  of 800.
Say  there were four reasons,  having values 200,  300, 200, and 500.
The double lines  ``$\|\ldots\|$'' indicate normalization: the result of the
square-root is divided by 1000, then truncated to 1.0 if it exceeds unity,
so the final value lies between 0 and 1.

In this case, we first compute 
$\sqrt{200↑2 + 300↑2  + 200↑2 + 500↑2} = \sqrt{420,000}$, which is about 648.  
After normalization, this becomes
0.648.  The expression in large parentheses in the formula
is actually computed
as  the  dot-product of  two  vectors, namely,  (Worth(A), Worth(F),
Worth(C)) and (.2   .3  .5);
in this case it is the dot-product  of (100 700 800) and (.2 .3  .5),
which yields  630. This is  multiplied by the  normalized Square-root
value, 0.648, and we end up with a final priority rating of 408.

The  four reasons  each have  a  fairly low  priority, and  the total
priority of the task is  therefore not great. It is, however,  higher
than any  single reason multiplied by  0.648.  This is  because there
are  many {\sl distinct}  reasons  supporting it.   The  global formula
uniting these  reasons' values does  not simply  take the largest  of
them (ignoring the rest), nor does it simply add them up.

The above  formula was intended originally  as a first  pass, an {\it ad
hoc} guess, which I expected I'd have to modify later. Since it  has
worked successfully, I have  not messed with it.  There  is no reason
behind it, no  justification for taking dot-products of vectors, etc.
I  concluded, and  recent  experiments  tend  to  confirm,  that  the
particular  form of  the formula  is unimportant;  only some  general
characteristics need be present:

\listbegin
\bullistentry{The priority value of a task is a monotone increasing function of
each of its reasons' ratings.   If a new supporting reason  is found,
the  task's value  is increased.   The  better that  new  reason, the
bigger the increase.}

\bullistentry{If an already-known supporting reason is re-proposed, the value of
the task is  {\sl not} increased (at least, it's not increased very much).
Like  humans, {\AM} is  fooled whenever the same reason reappears in disguised
form.}

\bullistentry{The priority of a  task involving concept C  should be a monotone
increasing function of  the overall worth of  C. Two similar  tasks
dealing  with two  different concepts,
each supported by the same list of reasons and reason ratings,  should be
ordered by the worth of those two concepts.}
\listend

\noindent I believe that all these criteria are absolutely essential to good
behavior of  the  system.   Several of the experiments discussed later bear on
this question (see Section 6--2).
Note that the messy formula does incorporate all  three
of these constraints.  In addition,  there are a few features of that
formula  which, while probably  not necessary or  even desirable, the
reader should be informed of explicitly:

\listbegin
\bullistentry{The task's value does not depend on the order in which the reasons
were discovered.  This is not true psychologically  of people, but it
is a feature of the particular priority-estimating formula  initially
selected.}

\bullistentry{Two reasons  are  either considered  identical  or unrelated.  No
attempt is  made to reduce the priority  value because several of the
reasons are overlapping semantically or even just syntactically. This,
too, is no doubt a mistake.}

\bullistentry{There is no need to keep around all the individual reasons' rating
numbers.  The addition of a new reason will demand only the knowledge
of the {\sl number} of other reasons, and the old priority value of the
task.}

\bullistentry{A task  with no reasons  gets an  absolute zero  rating.   As new
reasons are added, the priority  slowly increases toward an  absolute
maximum which is dependent upon the overall  worth of the concept and
facet involved.}
\listend

\noindent There is one topic of passing interest which should be covered here.
Each possible Act A (\eg, Fillin, Check, Apply) and each possible facet
F (\eg, Examples, Definition, Name(s)) are assigned a fixed numeric value
(by hand, by the author).
These values are used inside the priority formula given earlier, where it says
``Worth(A)'' and ``Worth(F).''  They are fairly resistant to change, but
certain orderings should be maintained for best results. For example, ``Examples''
should be rated higher than ``Specializations,'' or else {\AM} may whirl away
on a cycle of specialization long after the concept has been constrained
into vacuousness. As for the Acts, their precise values turned out to be even less
important than the Facets'.

Now that we've  seen how to compute this priority value for any given
task, let's not forget what it's used for. The overall rating has two
functions:

\listbegin
\numlistentry[1]{These ratings determine which task
to execute  next. This  is not  an  ironclad policy:  In reality,  {\AM}
prints  out  the top  few  tasks,  and the  user  has  the option  of
interrupting and directing  {\AM} to work  on one of  those other  tasks
instead of the very top one.}

\numlistentry[2]{Once a  task is chosen,  its  rating determines  how much
time and space {\AM}  will expend on it before quitting and moving on to
a new  task.  The  precise formulae  are unimportant.   Roughly,  the
0--1000 rating is divided by ten  to determine how much time to allow,
in  cpu seconds. The rating  is divided by two  to determine how much
space to allow, in list cells.}
\listend
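The second use of the rating reduces to two divisions. A sketch, using the rough rules just stated (the function name is mine, and the text itself stresses that the precise formulae are unimportant):

```python
def time_space_quanta(rating):
    # Roughly: divide the 0--1000 rating by ten for cpu seconds,
    # and by two for list cells.
    cpu_seconds = rating / 10
    list_cells = rating // 2
    return cpu_seconds, list_cells

# The task rated 408 earlier would get roughly 40.8 cpu seconds
# and 204 list cells before AM quits and moves on.
print(time_space_quanta(408))
```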

\sectionskip
\sectionbegin[3]{Heuristics Create New Concepts}
Recall that a heuristic  rule's actions are of three types:

\listbegin
\bullistentry{Suggest new tasks and add them to the agenda.}
\bullistentry{Create a new concept.}
\bullistentry{Fill in some entries for a facet of a concept.}
\listend

\noindent This section discusses the second activity.

\yskip

Here is the basic idea:
Scattered among the ``things to do'' in the right-hand side of a rule
are some requests to create specific new concepts. For each such request,
the heuristic rule
must specify how to construct it. At least, the rule must
specify  ways  of assembling  enough  facets  of the  new  concept to distinguish
it  from all  the other known  concepts. Typically,  the
rule will explain how to  fill in the Definition of---or an Algorithm
for---the new concept.  After executing these instructions, the new
concept will ``exist,'' and a few of its facets will be filled in, and
a few new jobs will probably exist on the agenda, indicating that {\AM} might want
to fill in certain other facets of this new concept in the near future.

\subsectionbegin[3.1]{An Illustration: Discovering Primes}
Here is a heuristic rule  that results in a new concept being created:

\han2{{\it If the current task was {\bf (Fill-in examples of F)},}}

\han3{{\it and F is an operation from domain space A into range space B,}}

\han3{{\it and more than 100 items are known examples of A (in the domain of F),}}

\han3{{\it and more than 10 range items (in B) were found by applying F to these
domain items,}}

\han3{{\it and at least  1 of these range items is a distinguished member (esp:
extremum\ffootnote{)}{This is handled as follows:
{\AM} takes the given list of range items. It eliminates
any which are not interesting (according to Interests(B))  or extreme (an entry
on B.Exs-Bdy, the boundary examples of B).
Finally, all those extreme range items are moved to
the front of this list. 
{\AM} begins walking down this list, creating
new concepts according to the rule. Sooner or later,
a timer (or a storage-space watcher) will terminate this costly activity. 
Only the frontmost
few range items on the list will have generated new concepts.
So ``especially'' really just means priority consideration.}
of B,}}


\han2{{\it Then (for each such distinguished member
 `b'$\epsilon$B) create the following new concept:}}


\yskip

\ragged 1000000 

\moveright 30pt \inbox{\hbox
{\vbox 
{\hbox{
\bf Name:  \it F\inv -of-b } \hbox{ 
\bf Definition: \it $\lambda (x) \biglp F(x) = b \bigrp $ } \hbox{ 
\bf Generalization: {\it A } } \hbox{ 
\bf Worth: \it $ Average \biglp Worth(A), Worth(F), Worth(B), Worth(b), $ } \hbox{ 
\hskip 80pt $100 \times max(10, \|Examples(B)\|)\bigrp $ } \hbox{ 
\bf Interest: \it Any conjecture involving both this concept and either F or F\inv 
}}}}

\ragged 0

\han5{\it In case  the user  asks, the  reason for  doing this is:  
``Worthwhile
investigating those A's  which have an unusual F-value, namely, those
whose F-value is b''}

\yskip

\han3{\it and the total  amount of time  to spend  
right now  on all  of these  new
concepts is computed as: }

\han4{\it Half the  remaining cpu time in  the current
task's time quantum.}

\han3{\it and the total  amount of space to spend  right now  on each of these  new
concepts is computed as: }

\han4{\it 90{\rm \%} \ of the remaining space quantum for the current task.}

\noindent \rm Heuristics for the new concept are quite hard to fill in.
This was one of {\AM}'s most serious limitations, in fact (see Chapter 7).
Above, we see a trivial kind of ``heuristic schema'' or template, which gets
instantiated to provide one new, specialized heuristic about the new concept.
That new heuristic tells how to judge the interestingness of any conjecture
which crops up involving this new concept. As new conjectures get proposed,
they are evaluated by calling on  a set of heuristics including this one.

Although some examples of {\bf F\inv -of-b} might be easily obtained
(or already known) at the moment of its creation, the above rule doesn't specifically
tell {\AM} how to fill in that facet. The very last line of the heuristic indicates
that a few cpu seconds may be spent on just this sort of activity: filling in facets
of the new concept which, though not explicitly mentioned in the rule, are
easy to fill in now.
Any facet {\it f} which didn't get filled in ``right now'' will
probably cause a new task to be added to the agenda, of the form: ``{\bf Fillin
facet {\it f} of concept {\it F\inv -of-b}}.''
Eventually, {\AM} would choose that task, and spend a large quantum of time working
on that single facet.

Now let's look at an instance of when this heuristic was used. At one
point,  {\AM}   was  working   on   the  task   ``{\bf Fill-in  examples   of
Divisors-of}.''

This  heuristic's  IF-part was  triggered because: Divisors-of is  an
operation (from Numbers to Sets  of  numbers), and far more than 100  different
numbers  are  known, and  more than 10  different  sets  of  factors  were  found
altogether,  and some  of them  were  distinguished by  being extreme
kinds of sets: empty-sets, singletons, doubletons  and  tripletons.   

After its left side triggered, the right side of the heuristic rule was executed.
Namely, four new concepts were created  immediately. Here is one of
them:

\yskip

\ragged 1000000 

\moveright 30pt \inbox{\hbox
{\vbox 
{\hbox{
\bf Name:  \it Divisors-of\inv -of-Doubleton } \hbox{ 
\bf Definition: \it $\lambda (x)\ \biglp$  Divisors-of({\it x}) 
is a Doubleton$\bigrp $ } \hbox{ 
\bf Generalization: {\it Numbers } } \hbox{ 
\bf Worth: \it 100 } \hbox{ 
\bf Interest: \it Any conjecture involving both this concept and either}\hbox{
\it Divisors-of or Times
}}}}

\ragged 0

\yskip\rm

\noindent This is a  concept representing a  certain class of numbers,  in fact
the  numbers  we  call {\sl primes}.  The  heuristic  resets a  certain
variable, so that in  case the user  interrupts and asks {\bf Why?}, {\AM} responds:
{\it ``This concept was created because it's worthwhile investigating 
those numbers which 
have an extreme divisors-of value; in this case, numbers 
which have only two divisors.'' }

{\AM} was willing to spend half
the  remaining  quantum of  time  allotted  to 
the task  {\bf ``Fillin examples  of
Divisors-of''} on these four new \ffootnote{concepts.}{Some trivial details: One-eighth
of the remaining time is spent on each of
these four concepts: Numbers-with-0-divisors, Numbers-with-1-divisor,
Numbers-with-2-divisors, Numbers-with-3-divisors.
The original time/space limits were  in reality about 25 cpu  seconds and
800  list cells, and  at the moment  this heuristic  was called, only
about 10 seconds and 600  cells remained, so  the concept  Primes
was  allotted only  1.2 cpu  seconds  to  ``get off  the
ground.'' This was no problem, as it used far less than that.
The heuristic rule states that each of the four new concepts may use up
to 90\%\  of the
remaining space allocation (540 out of 600 cells),
and Primes needed only a fraction of
that initially.}\par
The heuristic rule is  applicable to any operation, not  just numeric
ones.     For   example,  when   {\AM}  was   filling  in   examples  of
Set-Intersection, it was noticed that some pairs of sets  were mapped
into the  extreme kind of set  Empty-set. The above rule  then had {\AM}
define  the concept of  {\sl Disjointness}: pairs of  sets having empty
intersection.
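The definition facet created in the Divisors-of instance is concrete enough to run. Below is a sketch (helper names are mine) of the definition $\lambda (x)\, \biglp$Divisors-of($x$) is a Doubleton$\bigrp$; its extension over the small numbers is exactly the set we call primes.

```python
# Sketch of the created concept Divisors-of-inverse-of-Doubleton
# (helper names hypothetical; AM stored the definition as a LISP lambda).

def divisors_of(n):
    # The Divisors-of operation: all positive divisors of n.
    return [d for d in range(1, n + 1) if n % d == 0]

def divisors_inverse_of_doubleton(x):
    # Definition facet: Divisors-of(x) is a Doubleton (exactly two divisors).
    return len(divisors_of(x)) == 2

# The extension of this concept is the primes (1 is excluded: it has
# only one divisor, a singleton).
print([n for n in range(1, 30) if divisors_inverse_of_doubleton(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The analogous Set-Intersection case is the same schema with a different extremum: the distinguished range item is Empty-set rather than Doubleton, and the defined concept is Disjointness.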

\subsectionbegin[3.2]{The Theory of Creating New Concepts}
All the heuristic  rule must do is  to fill in enough  facets so that the
new concept  is disambiguated  from all  the others,  so  that it  is
``defined'' clearly.   Should {\AM}  pause and fill  in lots of  facets at
that  time?  After all, several pieces  of information are trivial to
obtain at this  moment, but may be  hard to reconstruct later  (\eg,
the  reason why  C  was created).    On the  other  hand, filling  in
anything without a good  reason is a  bad idea (it  uses up time  and
space,  and  it won't  dazzle  the  user  as a  brilliant  choice  of
activity).

So the universal motto of {\AM} is to fill in facets of a new concept
if---and only  if---that  filling-in activity will  be much easier  at
that moment than later on.

In almost  all cases, the following  \ffootnote{facets}{The reader may  wish to
glance    ahead    to    Section
5--2, to note the full range
of facets that any concept may possess: what their names are and the
kind of  information that  is stored  in each.} will  be  specified
explicitly in the heuristic  rule, and thus will get  filled in right
away:  Definitions, Algorithms,  Domain/range, Worth,  plus a  tie to
some related concept (\eg, if the new concept is a generalization of
Equality,  then   we  can   trivially  fill  in   an  entry   on  its
Specializations facet: ``Equality.'')

On  the other hand, the  following facets will  {\sl not} be trivial to
fill in: Con\-jec\-tures, Ex\-am\-ples, Gene\-rali\-za\-tions, 
Spe\-ciali\-za\-tions, and
In\-terest\-ing\-ness.   For example, filling in  the Spe\-ciali\-za\-tions
facet
of a new concept may involve creating some new concepts; finding some
entries  for its  Con\-jec\-tures  facet  may  involve a  great  deal  of
experimenting; finding  some Exam\-ples of it  may involve twisting its
definition around or searching.  None of these is easier to do at the time of
creation than at any other time, so each is deferred until some reason for
doing it exists.

For each such ``time-consuming'' facet F, of the new concept C, one new
task gets added  to the agenda,  of the form  ``{\bf Fill in entries  for
facet F of concept C},'' with reasons of the form ``Because C was just
created,'' and also ``No  entries exist so  far on \footnote{C.F.''}{C.F is
an abbreviation for facet F of concept C.} Most of  the
tasks generated this way will have low priority ratings, and may stay
near the bottom  of the agenda until/unless they are re-suggested for
a new reason.
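The deferral policy can be sketched as follows; the task and agenda representations here are hypothetical simplifications of ours, not AM's actual LISP structures:

```python
# Sketch: when a concept C is created, one low-priority task is queued per
# "time-consuming" facet, rather than filling those facets in immediately.

TIME_CONSUMING_FACETS = ["Conjectures", "Examples", "Generalizations",
                         "Specializations", "Interestingness"]

def tasks_for_new_concept(concept, facet_worth):
    """facet_worth: an assumed fixed rating per facet (Examples is rated
    well above Specializations, as the footnote in the text explains)."""
    tasks = []
    for facet in TIME_CONSUMING_FACETS:
        reasons = [f"{concept} was just created",
                   f"No entries exist so far on {concept}.{facet}"]
        tasks.append({"act": "Fill in", "facet": facet, "concept": concept,
                      "rating": facet_worth.get(facet, 100),
                      "reasons": reasons})
    # Higher-rated tasks rise toward the top of the agenda.
    return sorted(tasks, key=lambda t: -t["rating"])

agenda = tasks_for_new_concept("Primes",
                               {"Examples": 300, "Specializations": 100})
```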

Using the Primes example, from the last subsection, we see that a new
task like ``{\bf Fillin specializations of Primes}'' was suggested with a
low rating,  and ``{\bf Fillin examples of Primes}'' was suggested with a
\footnote{mediocre}{Not as low a rating as the task just mentioned. Why?  Each
possible facet  has a worth rating  which is fixed once  and for all.
As an illustration, we mention that the facet Examples is rated  much
higher than  Specializations.   Why  is this?    Because looking  for
examples of a  concept is often a good expenditure of time, producing
the raw  data on which  empirical induction  thrives.   On the  other
hand, each  specialization of  the new  concept C  would itself  be a
brand new concept. So filling in entries for the Specializations facet
would be a very explosive  process.} rating.  The ratings  of these
tasks increase later on, when the same tasks are re-proposed for new reasons.

\subsectionbegin[3.3]{Another Illustration: Squaring a Number}
Let's take  another  simple (though  not atypical)  illustration of how
new concepts get created.

Assume  that  {\AM}  has  recently  discovered the  concept  of
multiplication, which it calls ``TIMES,'' and {\AM} decides that it  is very
interesting. A heuristic  rule exists which \footnote{says:}{By glancing back
at the  Primes example, in Section 2--4.3.1, you can imagine
what this rule actually looked like. There is nothing to be gained by
stretching it out in all its glory, hence I've taken the liberty of
condensing it, inserting pronouns, etc.}

\han2{{\it If a  newly-interesting  operation F(x,y)  takes  a pair  of  N's  as
arguments,}}

\han2{{\it Then create  a new  concept, a specialization of F,
called  F-Itself, taking  just one N  as
argument, defined as F(x,x), with initial worth Worth(F).}}

\yskip

\noindent In the  case of F = TIMES,  we see that F takes  a pair of numbers as
its arguments,  so the  heuristic rule  would have  {\AM}  create a  new
concept  called   TIMES-Itself,  defined  as   TIMES-Itself(x) =
TIMES(x,x).   That is, create the new  concept which is the operation
of {\sl squaring a number}.

What would  {\AM} do in  this situation?   The  global list of  concepts
would  be enlarged to  include the  new atom ``TIMES-Itself,''  and the
facets of  this  new  concept would  begin  to  be filled  in.    The
following facets would get filled in almost instantly:

\yskip

\han3{{\bf  NAME: {\it TIMES-Itself }}}

\han3{{\bf  DEFINITIONS: {\it }}}

\han5{{\bf ORIGIN: {\it $\lambda (x,y) \biglp $TIMES.\ DEFN$(x,x,y)\bigrp $}}}

\han3{{\bf  ALGORITHMS: {\it $\lambda (x) \biglp $ TIMES.\ ALG$(x,x)\bigrp $}}}

\han3{{\bf  DOMAIN/RANGE: {\it Number $\→$ Number }}}

\han3{{\bf  GENERALIZATIONS: {\it TIMES }}}

\han3{{\bf  WORTH: {\it 600 }}}

\yskip

\noindent The  name,   definition,   domain/range, 
generalizations, and worth  are   specified
explicitly by the heuristic rule.

The  lambda  expression  stored  under  the  definition facet  is  an
executable LISP predicate, which accepts two arguments and then tests
them to  see whether the second  one is equal to  TIMES-Itself of the
first argument.   It performs this test by calling upon the predicate
stored under the  definition facet of the  TIMES concept.  
Thus TIMES-Itself.Defn(4,16) will call on TIMES.Defn(4,4,16), and return
whatever value {\sl that} predicate returns (in this case, it returns True,
since 4$\times$4 does equal 16).

A  trivial
transformation  of  this definition provides  an  algorithm  for computing  this
operation. The algorithm says to call on the Algorithms facet of the concept
TIMES. Thus TIMES-Itself.Alg(4) is computed by 
calling on TIMES.Alg(4,4) and returning {\sl that} value (namely, 16).
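In modern notation, the two derived procedures can be sketched like this; `times_defn` and `times_alg` are Python stand-ins for the predicate and algorithm stored on the TIMES concept:

```python
# Stand-ins for the entries stored on the facets of the TIMES concept.
def times_defn(x, y, z):       # does x times y equal z?
    return x * y == z

def times_alg(x, y):           # compute x times y
    return x * y

# TIMES-Itself's facets are defined by deferring to TIMES, as in the text.
def times_itself_defn(x, y):   # is y the square of x?
    return times_defn(x, x, y)

def times_itself_alg(x):       # compute the square of x
    return times_alg(x, x)
```

For instance, `times_itself_defn(4, 16)` calls `times_defn(4, 4, 16)` and returns True, mirroring the TIMES-Itself.Defn(4,16) call described in the text.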

The worth of TIMES was 600 at  the moment TIMES-Itself was created, 
and this becomes the  worth
of TIMES-Itself.

TIMES-Itself  is by  definition  a specialization  of  TIMES, so  the
SPE\-CIAL\-I\-ZA\-TIONS  facet of TIMES is  incremented to point  to this new
concept.  Like\-wise, the  GENERALIZATIONS facet of TIMES-Itself  points
to TIMES.

Note how easy it  was to fill in these facets  now, but how difficult
it might be later on, ``out of context.'' By way of contrast,
the task of filling in
{\sl Specializations} of TIMES-Itself will  be no harder later  on than
it is  right now,  so we may  as well defer  it until there's  a good
reason for it. This task will probably be added to the agenda with so
low a priority that {\AM}  will never get around to it,  unless some new
reasons for it emerge.

The   task  {\bf ``Fill-in   examples  of  TIMES-Itself''}   is  probably
worthwhile doing soon, but  again it won't be any  harder to do at  a
later time than  it is right now.   So it is not done  at the moment;
rather, it is added to the agenda (with a fairly high priority).

Incidentally, the  reader may be interested to know that the next few
tasks {\AM} selected  (in reality)  were to create  the inverse of  this
operation (\ie, integer square-root),  and then to create a new kind
of number, the ones which can be produced by squaring (\ie,  perfect
squares).   Perfect squares were  deemed worth having  around because
Integer-square-root is {\sl defined} precisely on that set of integers.

\sectionskip
\sectionbegin[4]{Heuristics Fill in Entries for a Facet}
The last two sections  dealt with how a heuristic rule  is able to
propose  new  tasks  and  create  new  concepts.  This  section  will
illustrate how a rule can find some entries for 
a specific facet of a specific concept.

\subsectionbegin[4.1]{An Illustration: ``Fill in Examples of Set-union''}
Typically,  the facet/concept involved will be the one mentioned in the
current task which was chosen from the agenda.
If the task ``{\bf Fillin Examples of Set-union}'' were plucked from the agenda,
then the ``relevant'' heuristics would be those useful for filling in
entries for the Examples facet of the Set-union concept.

Here's one such relevant heuristic, attached to the concept Activity:

\han2{{\it If the current task is to fill in examples of the
\ffootnote{{\it activity}}{``Activity''
is a general concept which includes operations, predicates,
relations, functions, etc.}
F,}}

\han2{{\it One  way to get them is  to run F on  randomly chosen examples of the
domain of F.}}

\noindent Of course, in the LISP implementation, this  situation-action rule is
not coded quite so neatly.  It would be more faithfully translated as
follows:

\han2{{\it If CURRENT-TASK matches (FILLIN EXAMPLES F$\←$anything),}}

\han3{{\it and F isa Activity,}}


\han2{{\it Then carry out the following procedure:}}

\han3{{\it 1.  Find the domain of F, and call it D;}}

\han3{{\it 2.  Find examples of D, and call them E;}}

\han3{{\it 3.  Find an algorithm to compute F, and call it A;}}

\han3{{\it 4.  Repeatedly:}}

\han4{{\it 4a. Choose any member of E, and call it E1.}}

\han4{{\it 4b. Run A on E1, and call the result X.}}

\han4{{\it 4c. Check whether <E1,X> satisfies the definition of F.}}

\han4{{\it 4d. If so, then add <E1 $\→$ X> to the Examples facet of F.}}

\han4{{\it 4e. If not, then add <E1 $\→$ X> to the Non-examples facet of F.}}

\yskip

\noindent When
the  current  task  is ``{\bf Fillin  examples  of  Set-union},'' the
left-hand side of the  rule is satisfied,  so the  right-hand side is
run.
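The right-hand side's procedure can be rendered compactly; this Python sketch is ours, and assumes the domain examples, algorithm, and definition have already been fetched from the concept's facets (Steps 1 through 3):

```python
# A compact sketch of Step 4 of the rule: run the activity's algorithm on
# randomly chosen domain examples, sorting each result into the Examples
# or Non-examples facet according to the definition.
import random

def fill_in_examples(domain_examples, algorithm, definition, n_tries=20):
    """domain_examples: list of argument tuples; algorithm computes F;
    definition tests whether <args, result> really is an example of F."""
    examples, non_examples = [], []
    for _ in range(n_tries):
        e1 = random.choice(domain_examples)     # step 4a
        x = algorithm(*e1)                      # step 4b
        if definition(*e1, x):                  # step 4c
            examples.append((e1, x))            # step 4d
        else:
            non_examples.append((e1, x))        # step 4e
    return examples, non_examples
```

With a well-behaved toy activity every run lands in the Examples list; AM's version bounds the loop by the task's time/space quanta rather than a fixed `n_tries`.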

Step (1) says to locate  the domain of Set-union. The facet  labelled
Do\-main/\-Range, on the Set-union concept, contains the entry (SET SET $\→$
SET), which indicates that the domain is a pair of sets.
That is, Set-union is an operation which accepts (as its arguments) 
a pair of sets, and returns
(as its value) some new set.

Since the domain elements are sets,
step  (2)  says  to  locate  examples  of  sets.  The facet  labelled
Examples, on the Sets concept, points to a list of about 30 different
sets.  This includes $\{$Z$\}$,
 $\{$A,B,C,D,E$\}$,
 $\{$$\}$,
 $\{$A,$\{\{$B$\}\}\}, \ldots$

Step  (3) involves  nothing more  than accessing some randomly chosen
entry on  the Algorithms  facet of  Set-union. One  such entry  is  a
recursive LISP function of two arguments, which  halts when the first
argument is  the empty set, and otherwise   pulls an element out of
that set  and SET-INSERT's  it  into the  second argument,  and  then
recurs on the  new values of the two sets.   For convenience, we'll
refer to this algorithm as UNION.

We then enter the loop of Step (4).  Step (4a) has us choose one pair
of our examples of sets, say the first two $\{$Z$\}$ and
 $\{$A,B,C,D,E$\}$.  Step
(4b) has us run UNION on these two sets. The result is $\{$A,B,C,D,E,Z$\}$.
Step (4c)  has  us  grab  an entry  from  the  Definitions  facet  of
Set-union, and run it. A typical definition is this formal one:

\han3{{\bf  $\lambda$  (S1 S2 S3) }}

\han4{{\bf  	( AND}}

\han5{{\bf   		(For all x in S1, x is in S3)}}

\han5{{\bf  		(For all x in S2, x is in S3)}}

\han5{{\bf  		(For all x in S3, x is in S1 or x is in S2)}}

\han4{{\bf  $\ \ $ )}}


\yskip

\noindent It   is  run   on  the   three   arguments  S1=$\{$Z$\}$,
   S2=$\{$A,B,C,D,E$\}$,
 S3=$\{$A,B,C,D,E,Z$\}$.  Since it returns ``True,''
we proceed  to  Step (4d).  The  construct\par
\noindent <$\{$Z$\}$, $\{$A,B,C,D,E$\} \→$
$\{$A,B,C,D,E,Z$\}$> is added to Set-un\-ion.Ex\-am\-ples.
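That formal definition translates almost verbatim into a modern predicate (a Python sketch; the name `is_union` is ours):

```python
def is_union(s1, s2, s3):
    """True iff s3 is exactly the union of s1 and s2, per the three
    clauses of the formal definition in the text."""
    return (all(x in s3 for x in s1) and
            all(x in s3 for x in s2) and
            all(x in s1 or x in s2 for x in s3))

# The instance from the text: S1={Z}, S2={A,B,C,D,E}, S3={A,B,C,D,E,Z}.
result = is_union({'Z'}, {'A', 'B', 'C', 'D', 'E'},
                  {'A', 'B', 'C', 'D', 'E', 'Z'})
```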

At this stage, control returns to the beginning of the Step (4) loop.
A new pair of sets is chosen, and so on.

But when would this loop  stop? Recall that each task has a time and a space
allotment (based on its priority value). 
If there are many different rules all claiming to be relevant to the current task,
then each one is allocated a small fraction of those time/space quanta.
When either of these resources is exhausted,
{\AM} would break away  at a ``clean'' point (just
after finishing a cycle of the Step (4) loop) and would move on to
a new heuristic rule for filling in examples of Set-union.

This concludes the demonstration  that a heuristic rule really can be
executed to produce the  kinds of entities  requested by the  current
task.

\subsectionbegin[4.2]{Heuristics Propose New Conjectures}
We saw in the sample excerpt (Chapter 2) that {\AM} occasionally notices
some  unexpected  relationship,  and  formulates  it  into a  precise
conjecture.  Below is an example of  how this is done.  As you  might
guess from the placement of this subsection,
the mechanism is our old
friend the heuristic rule which fills in entries for certain facets.

In fact, a conjecture evolves through four stages:

\listbegin
\numlistentry[1]{A heuristic  rule looks  for a  particular kind  of relationship.
This will typically be of the form ``X is a Generalization of Y,'' or ``X is
an example of Y,'' or ``X is the same as Y,''
or \ffootnote{``F1.Defn(X,Y).''}{This says that F1(X)=Y, 
where F1 is an active  concept {\AM} knows about.}}

\numlistentry[2]{Once found, the relationship is checked empirically.
A great deal of empirical evidence must favor it,
and any contradictory evidence must be ``explained away'' somehow.}

\numlistentry[3]{Now it is believed, and  {\AM} prints it out to the user. 
It is added as a new entry to the Conjecs facet of both concepts X and Y.
It  is also
added as an entry to the Examples facet of the Conjecture concept.}

\numlistentry[4]{Eventually,  {\AM} will get around  to the task  ``{\bf Check Examples of
Con\-jec\-ture},''
or to the task ``{\bf Check Conjecs of X}.''
If {\AM} had any concepts for proving conjectures,  they
would then be invoked. In the current LISP implementation, these are 
absent.
Nevertheless, several ``checks'' are performed: ({\it i\/}) see if any new empirical
evidence (pro or con) has appeared recently; ({\it ii\/}) see if this conjecture can
be strengthened; ({\it iii\/}) check it for extreme cases, and modify it if necessary; and
({\it iv\/}) modify the worth ratings of the concepts involved in the conjecture.}
\listend

The left-hand side of such  a heuristic rule will be  longer and more
complex  than  most other  kinds,  but the  basic  activities  of the
right-hand side will still  be filling in an  entry for a  particular
facet. 

The entries filled in will include:
({\it i\/}) a   new  example   of  Conjectures,
({\it ii\/}) a new entry for the Conjecs facet of each concept involved in the conjecture,
({\it iii\/}) if we're claiming that
concept X is a generalization of concept Y, then ``X'' would be added to the
Generalizations facet of Y, and ``Y'' added to the Specializations facet of X,
({\it iv\/}) if X is an Example of Y, 
``X'' is added to the Examples facet of Y, and ``Y'' is added to the ISA facet of X.

The right-hand side  may also  involve  adding new  tasks to  the agenda,
creating new concepts, and modifying entries of  particular facets of
particular concepts.  As is true of all heuristic rules, 
both  sides of this type of conjecture-perceiving rule
may run any little functions they  want to: any functions which are quick and  have
no side effects (\eg, FOR-ALL tests,  PRINT functions, accesses to
a specified facet of some concept).

\subsectionbegin[4.3]{An Illustration: ``All Primes Except 2 are Odd''}
As an illustration, here is a heuristic  rule, relevant when checking
examples of any concept:

\han2{{\it If the current task is to Check Examples of X,}}

\han3{{\it and (Forsome Y) Y is a generalization of X,}}

\han3{{\it and Y has at least 10 examples,}}

\han3{{\it and all examples  of Y (ignoring boundary cases) are also examples of X,}}

\han2{{\it Then print the following conjecture: X is really no  more specialized than Y,}}

\han3{{\it and add it to the Examples facet of Conjectures,}}

\han3{{\it and if the user asks, inform the user that the evidence for this was that
all  $\|$Examples(Y)$\|$ Y's  (ignoring boundary examples  of Y's) turned
out to be X's as well,}}

\han3{{\it and Check the truth of this conjecture on boundary examples of Y,}}

\han3{{\it and add ``X'' to the Generalizations facet of Y,}}

\han3{{\it and add ``Y'' to the Specializations facet of X,}}

\han3{{\it and (if there is an entry in the Generalizations facet  of Y) add the
following task to  the agenda {\bf ``Check examples of  Y''}, for the reason:
``Just as Y was no more general than X, one-of Generalizations(Y)  may
turn out to be no more general than Y,'' with a rating for that reason
computed       as: 
 $ .4\cdot \|Examples(Generalizations(Y))\| +
 .3\cdot \|Examples(Y)\| + 
 .3\cdot Priority(Current\  task)$.}}

\yskip

\noindent Let's take a particular instance where this rule would be useful. Say
the  current  task  is ``{\bf Check  examples  of  Odd-primes}.''  The
left-hand side  of  the  rule  is  run,  and  is  satisfied when  the
generalization Y  is the  concept ``Primes.''   Let's see  why this  is
satisfied.

One of the entries on the Generalizations facet of Odd-primes is ``Primes.''
{\AM} grabs hold of the 30  examples of primes (located on the  Examples
facet of Primes), and  removes the ones which are  tagged as boundary
examples  (``2'' and  ``3'').    A definition  of  Odd-prime  numbers is
obtained (Definitions facet  of Odd-primes),  and it is  run on  each
remaining example  of primes (5, 7,  11, 13, 17, \ellipsis).   Sure enough,
they  all satisfy the  definition.  So all  primes (ignoring boundary
cases)  appear  to be  odd.    The  left-hand side  of  the  rule  is
satisfied.

At this  point, the user  sees a message  of the form  ``Odd-primes is
really no more specialized than Primes.''   If the user interrupts and  asks
about it,  {\AM} responds that  the evidence  for this was  that all  30
primes  (ignoring  boundary  examples of  primes)  turned  out to  be
Odd-primes.

Of the boundary examples (the numbers 2 and 3), only  the integer
``2'' fails to be an odd-prime, so the user is notified of the
finalized   conjecture:  ``All  primes  (other   than  `2')  are  also
odd-primes.''   This is added  as an  entry on  the Examples facet  of
the concept named `Conjectures.'
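The empirical check just narrated can be sketched as follows (a Python stand-in; the primality test and the boundary tags are our simplifications of AM's facet entries):

```python
def is_odd_prime(n):
    """Our stand-in for running a definition from Odd-primes.Defns."""
    return n > 2 and n % 2 == 1 and all(n % d for d in range(3, n, 2))

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # Examples facet of Primes
boundary = [2, 3]                                # tagged boundary examples

# Do all non-boundary examples of Primes satisfy the Odd-primes definition?
non_boundary = [p for p in primes if p not in boundary]
conjecture_holds = all(is_odd_prime(p) for p in non_boundary)

# Then the boundary cases are examined: only "2" fails, yielding the
# finalized conjecture "All primes except 2 are odd."
exceptions = [p for p in boundary if not is_odd_prime(p)]
```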

Before  beginning all this,  the Generalizations facet  of Odd-primes
pointed to Primes. Now, this rule has us add ``Primes'' as an entry  on
the  Specializations facet  of  Odd-primes.  Thus  Primes is  both  a
generalization and a specialization of Odd-primes (to within a single
stray exception), and {\AM} will be able to treat these two  concepts as
if they were merged together.  They are still kept separate, however,
in  case {\AM} ever needs to know  precisely what the difference between
them is, or in case later evidence shows the conjecture to be
\ffootnote{false.}{When  space  is  exhausted, one  emergency 
measure  {\AM}  takes is  to
destructively coalesce  a pair  of concepts  X,Y where  X is  both  a
generalization of and a specialization of Y, even if there are a couple
``boundary'' exceptions.}\par
The final action  of the right-hand side of this rule is to propose a
new task (if there exist some generalizations of the concept Y, which
in our case  is ``Primes'').  So {\AM} accesses  the Generalizations facet
of Primes, which is ``(Numbers).'' A new task is therefore added to the
agenda: ``{\bf Check  examples of Primes},'' with an associated  reason:
``Just as Primes  was no more general than  Odd-primes, so Numbers may
turn out to  be no  more general than  Primes.'' The  reason is  rated
according to  the formula given  in the rule;  say it gets  the value
500.

To  make this example a  little more interesting,  let's suppose that
the task ``{\bf Check  examples  of  Primes}'' already  existed  on  the
agenda, but for the reason ``Many examples of Primes have been found,
but never  checked,'' with a rating for the reason of 100, and for the
task as a whole of 200.  The global  task-rating formula then assigns
the task  a new overall priority  of 600, because of  the new, fairly
good reason supporting it.

When that  task is  eventually chosen,  the heuristic  rule  pictured
above (at the beginning  of this subsection) will trigger  and will be run
again, with  X=Primes and Y=Numbers. That is,  {\AM} will be considering
whether (almost) all numbers are  primes.  The left-hand side of  the
heuristic rule will  quickly fail, since, \eg, ``6'' is  an example of
Numbers which does not satisfy the definition of Primes.

\subsectionbegin[4.4]{Another Illustration: Discovering Unique Factorization}
Below is  a heuristic  rule which  is a  key agent in  the process  of
``noticing''  the   fundamental  theorem  of  \ffootnote{arithmetic.}{The unique
factorization conjecture: any positive  integer {\it n} can be  represented
as  the product  of prime  numbers in  precisely one  way (to  within
reorderings of those prime factors).  Thus 28 = 2$\times$2$\times$7, and we don't
distinguish between the factorization  (2 2 7) and  (2 7 2).}

\han4{{\it If F({\rm a}) is unexpectedly  a BB,}}

\han4{{\it Then maybe ($\forall $x) F(x) is a BB.}}

\noindent Below, the same rule is given in more detail.
The first conjunct on the IF-part of the heuristic rule indicates that it's
relevant  to checking examples  of any  given operation F.  A typical
example is selected at random, say F(x)=y. Then y is examined, to see
if it satisfies any more stringent properties than those specified in
the  Domain/range facet of F.   That is, the  Domain/range facet of F
contains an entry of the form A$\→$B;  so if x is an A, then all  we are
guaranteed about y  is that it is an  example of a B.   But now, this
heuristic  is asking  if y  isn't really  an example  of a  much more
specialized concept than B.  If it is (say it's an  example of a BB),
then the  rest of the examples of  F are examined to see  if they too
satisfy this same property. If all examples appear to map from domain
set A into  range set BB (where BB  is much more restricted  than the
set  B specified originally in  the Domain/range facet of  F), then a
new conjecture is  made: the domain/range  of F  is really A$\→$BB,  not
A$\→$B.  Here is that rule, in crisper notation:

\yskip

\han2{{\it If the current task is to Check Examples of the operation F,}}

\han3{{\it and F is an operation from domain A into range B,}}

\han3{{\it and F has at least 10 examples,}}

\han3{{\it and a typical one of these examples is ``<x$\,\→\,$y>'' 
(so x$\,\epsilon $A and y$\,\epsilon$ B),}}

\han3{{\it and (Forsome Specialization BB of B), y is a BB,}}

\han3{{\it and {\bf all}  examples of F (ignoring  boundary cases) turn out  to be BB's,}}

\han2{{\it Then print the following conjecture: ``F(a) is always a BB, 
not simply a B,''}}

\han3{{\it and add it to the Examples facet of Conjectures concept,}}

\han3{{\it and add ``<A  $\→$ BB>'' as a  new entry to  the Domain/range facet of  F,
replacing ``<A  $\→$  B>,''}}

\han3{{\it and if the user asks,  inform him that the evidence for this was that
all $\|$Examples(F)$\|$ examples of F (ignoring boundary examples) turned
out to be BB's,}}

\han3{{\it and check  the  truth of  this conjecture  by running  F on  boundary examples of A.}}

\yskip

\noindent Let's see how this  rule was used in one instance. In Task 79  in the
sample excerpt  in Chapter  2, {\AM}  defined  the concept  Prime-times,
which was  a function transforming any  number {\it n} into the  set of all
factorizations  of {\it n} into primes.  For example, 
Prime-times(12)=$\{$(2 2 3)$\}$, 
Prime-times(13)=$\{$(13)$\}$.   The  domain of  F=Prime-times was  the
concept Numbers.  The range was  Sets. More precisely,  the range was
Sets-of-Bags-of-Numbers, but  {\AM} didn't  know  that concept  at  that
time.

The above  heuristic rule was  applicable. F  was Prime-times, A  was
Numbers, and B was  Sets.  There were far more than 10 known examples
of Prime-times  in action.  A typical  example was  this  one: <21  $\→$
$\{$(3,7)$\}$>.    The  rule  now  asked   that  $\{$(3,7)$\}$  be  fed  to  each
specialization  of  Sets,  to  see  if  it  satisfied  any  of  their
definitions.  The Specializations facet of Sets was accessed, and
each  concept pointed  to was  run (its  definition was  run)  on the
argument   ``$\{$(3,7)$\}$.''      It   turned   out   that   Singleton   and
Set-of-doubletons were the only two specializations of Sets satisfied
by this example.   At this moment, {\AM} had narrowed down the potential
conjectures to these two:

\listbegin
\bullistentry{{\it Prime-times(x) is always a singleton set.}}
\bullistentry{{\it Prime-times(x) is always a set of doubletons.}}
\listend

\noindent Each example  of Prime-times  was examined,  until one  was found  to
refute   each  conjecture   (for   example,  <8 $ \→\{$(2,2,2)$\}$>   destroys
conjecture 2).   But no example was able to disprove conjecture 1. So
the heuristic rule  plunged forward, and printed  out to the user  ``A
new conjecture: Prime-times({\it n})  is always a singleton-set, not simply
a set.''   The  entry <Numbers$\→$Singleton-sets> was  added to  the
Domain/range facet of Prime-times, replacing the old entry <Numbers$\→$Sets>.
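The narrowing-down process can be sketched in modern notation; the two specialization predicates and the tabulated examples below are our stand-ins for AM's concepts, not its actual data:

```python
# Sketch: given examples of F = Prime-times, keep only those specializations
# of the range (Sets) whose definition every example output satisfies.

def is_singleton(s):
    return len(s) == 1

def is_set_of_doubletons(s):
    return all(len(bag) == 2 for bag in s)

specializations_of_sets = {"Singleton": is_singleton,
                           "Set-of-doubletons": is_set_of_doubletons}

# Examples of Prime-times: n -> set of prime factorizations (as tuples).
prime_times = {12: {(2, 2, 3)}, 13: {(13,)}, 21: {(3, 7)}, 8: {(2, 2, 2)}}

surviving = [name for name, defn in specializations_of_sets.items()
             if all(defn(out) for out in prime_times.values())]
# Only "Singleton" survives: <8 -> {(2,2,2)}> refutes Set-of-doubletons.
```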

Let's  digress for a  moment to discuss  the robustness of  the system.
What if this  heuristic were to  be excised?  Could {\AM} still  propose
unique factorization?   The answer  is yes,  there are other  ways to
notice  it. If  {\AM} has  the concept  of a \ffootnote{Function,}{A single-valued
relation.  That is, for any domain element x, F(x) contains precisely
one  member.  It  is never  empty (\ie, undefined),  nor is  it ever
larger than a singleton (\ie, multiple-valued).} then a  heuristic
rule like the one in the previous subsection will
cause {\AM}  to ask if Prime-times is not merely  a Relation, but also a
Function.

The past few sections should have convinced the reader  that isolated
heuristic rules really can do all kinds of things: propose new tasks,
create   new   concepts,  fill   in   entries  for   specific  facets
(goal-driven),  and  look  for  conjectures   (data-driven  empirical
induction).   The rules appear  fairly \ffootnote{general}{That is, applicable in
many situations. It  would be worse  than useless if  a rule  existed
which could lead  to a single  discovery like ``Fibonacci  series'' but
never  lead  to  any other  discoveries.  The  reasons for  demanding
generality are not only ``fairness,'' but the insights which occur when
it is observed that several  disparate concepts were all motivated by
the  same general principle (\eg, ``looking  for the inverse image of
extrema'').}---though that must be later verified empirically.  They
are  redundant  in a  pleasing  way: some  of  the most  ``important,''
well-known, and interesting conjectures  can (apparently) be  derived
in many ways.  Again, we must justify this experimentally.

\sectionskip
\sectionbegin[5]{Gathering Relevant Heuristics}
Each concept has facets which (may) contain some heuristics.  Some of these
are  for  filling  in,  some  for  checking,  some for  deciding
\footnote{interestingness,}{The  reader  has already  seen  several  heuristics
useful for  filling in and checking  facets; here is one  for judging
interestingness: an  entry on the Interest facet of Compose says that
a composition A$\circ $B  is more interesting if  the range of B  {\sl equals} the
domain of A, rather than if they only {\sl partially}
overlap.} some for noticing new conjectures, etc.

{\AM} contains hundreds of these heuristics.  In order to save time (and
to make {\AM}  more rational), each heuristic should only be tried
in situations where it might apply, where it makes sense, in its
``domain of applicability.''

\subsectionbegin[5.1]{Domain of Applicability}
How is {\AM}  able to zero  in on the  relevant heuristic rules,  once a
task has been selected from the agenda?

The  secret  is that  each  heuristic rule  is  stored  somewhere {\it a
propos} to  its ``domain of  applicability.'' This  ``proper place''  is
determined by the first conjunct in the left-hand side of the rule.
What does this mean?  Consider this heuristic:

\han1{{\it If the current task  is to fill in examples of the  operation F, 
\hskip 14pt plus 5pt minus 4pt  $\←$===}}

\han3{{\it and some examples of the domain of F are known,}}

\han1{{\it Then one  way to get  examples of F  is to run  F on  randomly chosen
examples of the domain of F.}}

\yskip

\noindent This is a  reasonable thing to try---but only in certain situations.
Should it be tried when the current task is to check the  Worth facet
of the Sets concept? No, it would  be irrational.  Of course, even if
it were  tried then, the left-hand side would fail very quickly.  Yet
{\sl some}  cpu time  would  have  been  used, and  if  the  user  were
watching, his opinion of {\AM} would \footnote{decrease.}{This notion of
worrying about a human user who is observing {\AM} run in real time may
appear to be quite language- and machine-dependent. An increase in speed of a couple
orders of magnitude would radically alter the qualitative appearance of {\AM}.
In Chapter 7, however, the reader will grasp how difficult it
is to objectively rate a system like {\AM}. For that reason, all measures
of judgment must be respected.  Also, to the actual human being using the
system  this really is one of the more important measures.}\par
That particular  heuristic has a precise domain  of applicability: {\AM}
should use it whenever the current task is to fill in examples of  an
operation, and only in those kinds of situations.

The key observation  is that a  heuristic typically applies  to {\sl all
examples  of  a  particular  concept  C}.    In  the  case  we  were
considering,  C=Operation.   Intuitively,  we'd  like  to  tack  that
heuristic onto  the Examples  facet of the  concept Operation,  so it
would only ``come to mind'' in appropriate situations.  This is
precisely where the heuristic rule {\sl is} stored.

Initially, the author identified the proper concept C and facet F for
each heuristic H  which {\AM} possessed, and tacked  H onto \ffootnote{C.F.}{Recall
that C.F is an abbreviation for facet F of concept C.}  This was all
preparation, completed long before {\AM} started up.  Each heuristic was
tacked  onto  the  facet  which  uniquely  indicates  its  domain  of
applicability.  The first conjunct  of the IF-part of each  heuristic
indicates where it  is stored and where it is  applicable. Notice the
little arrow ($\←$===) pointing to that conjunct \footnote{above.}{In the LISP
implementation,  these  first   conjuncts  are  omitted,  since   the
{\sl placement} of  a heuristic  serves the same  purpose as if  it had
some ``pre-preconditions'' (like  these first  conjuncts) to  determine
relevance quickly.}\par
While {\AM} is running, it will choose a task dealing with, say, facet F
of concept  C.  {\AM} must quickly locate  the heuristic rules which are
relevant to  satisfying that  chosen  task.   {\AM} simply  locates  all
concepts  which claim  C as  an example.   If  the current  task were
``{\bf Check  the Domain/range of} \footnote{{\bf Union$\circ$Union},''}{This
operation is
defined as: $Union\circ Union(x,y,z) =↓{df} (x \union  y) \union z$.  
It  accepts three
sets as  arguments, and returns  a new set as  its value.} then C
would be Union$\circ$Union. Which concepts claim  C as an example?   They
include    Compose-with-Self,   Composition,    Operation,    Active,
Any-concept,  and Anything.   {\AM} then collects  the heuristics tacked
onto facet  F (in  this case,  F is  Domain/range) of  each of  those
concepts. All such heuristics will  be relevant. In the current case,
some  relevant  heuristics might  be  garnered from  the Domain/range
facet of the  concept Operation.  Any  heuristic which can deal  with
the Domain/range  facet of {\sl any} operation can  certainly deal with
Union$\circ$Union's    Domain/range.         A     typical    rule     on
\footnote{Operation.Domain/range.Check}{The   ``Check''   subfacet    of   the
``Domain/range''  facet of the  ``Operation'' concept.} would  be this
one:

\han2{{\it If a Dom/ran entry of F is of the form <D D D$\ldotss$D $\→$ R>, 

\han3{{\it and  R is a generalization of D,}}}}

\han2{{\it Then test whether the range might not be simply D.}}

\yskip

\noindent Suppose  one  entry   on  Union$\circ$Union.Dom/ran  was ``<Nonempty-sets
Nonempty-sets  Nonempty-sets $\→$ Sets>.''  Then  this last heuristic rule
would be relevant, and would  have {\AM} ask the plausible  question: Is
the  union of three  nonempty sets  always nonempty?   The  answer is
affirmative, empirically, so {\AM} modifies that Domain/range entry  for
Union$\circ$Union.       {\AM}   would   ask    the   same    question   for
Intersect$\circ$Intersect.  Although the  answer then would  be ``{\sl No},''
it's still a rational inquiry.   If {\AM} called on this heuristic  rule
when the  current task were ``{\bf Fillin specializations of  Bags},'' it
would  clearly be an irrational act.   The domain of applicability of
the rule is clear, and is precisely fitted to the slot where the rule
is stored (tacked onto Operation.Domain/range).
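
The behavior of that rule can be sketched in a few lines of Python (a modern illustration, not {\AM}'s Interlisp; the GENL table and the (domains, range) entry format are hypothetical stand-ins for {\AM}'s actual facet representation):

```python
# Sketch of the Operation.Domain/range.Check rule quoted above:
# if a Domain/range entry has the form <D D ... D -> R>, where R is a
# proper generalization of D, propose testing whether the range is just D.

GENL = {  # concept -> its immediate generalizations (toy fragment)
    "Nonempty-sets": ["Sets"],
    "Sets": ["Anything"],
    "Anything": [],
}

def is_generalization(r, d):
    """True iff r lies strictly above d along Genl links."""
    frontier, seen = list(GENL[d]), set()
    while frontier:
        c = frontier.pop()
        if c == r:
            return True
        if c not in seen:
            seen.add(c)
            frontier.extend(GENL[c])
    return False

def propose_range_narrowing(entry):
    """entry is (domains, range); return a proposed new entry, or None."""
    domains, rng = entry
    d = domains[0]
    if all(x == d for x in domains) and is_generalization(rng, d):
        return (domains, d)  # a plausible conjecture, to be tested empirically
    return None

entry = (["Nonempty-sets"] * 3, "Sets")  # the Union∘Union.Dom/ran entry
print(propose_range_narrowing(entry))    # proposes narrowing the range to Nonempty-sets
```

Note that the rule only {\sl proposes} the narrowing; as the text describes, {\AM} then checks the conjecture empirically, and for Intersect$\circ$Intersect the answer would be no.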

To recap the basic idea: when  dealing with a task ``{\bf Do act A on facet
F of concept C},'' {\AM} must locate  all the concepts X claiming C as  an
example. {\AM} then gathers  the heuristics tacked onto X.F.A,  for each
such general concept X.  All of them---and only they---are relevant
to satisfying that task.

So the whole problem of locating relevant heuristics has been reduced
to the problem  of efficiently finding all concepts of  which C is an
example (for  any given concept C).  This process is called ``{\sl rippling
away from C  in the ISA  direction},'' and forms  the subject of  the
next subsection.

\subsectionbegin[5.2]{Rippling}
Given a concept C, how can {\AM} find all  the concepts which claim C as
an example?

The most obvious  scheme is to store this information explicitly.  So
the Examples facet of C would  point to all known examples of C,  and
the Isa facet  of C would point  to all known concepts  claiming C as
one of their examples.  Why  not just do \ffootnote{this?}{This is
the implementation chosen by several projects, \eg MOLGEN
[Stefik 78].}   Because one can substitute a
modest amount of processing  time (via chasing links around)  for the
vast amount of storage space that would be needed to have ``everything
point to everything.''

Each facet contains only enough pointers so that the entire graph  of
Exs/Isa and Spec/Genl links could be {\sl reconstructed}  if needed.  Since
``Genl''
is a transitive  relation, {\AM} can  compute that  Numbers is a
generalization of Mersenne-primes, if the facet  Mersenne-primes.Genl
contains  the  entry  ``Odd-primes,'' and  Odd-primes.Genl  contains  a
pointer to  ``Primes,'' and Primes.Genl points to ``Numbers.''  This kind
of ``{\sl rippling}'' activity is used to efficiently locate all concepts
related  to a given  one X.  In particular, {\AM}  knows how  to ``ripple
upward in  the Isa  direction,'' and  \footnote{quickly}{With  about 200  known
concepts, with each Isa facet and each Genl facet pointing to about 3
other  concepts, about  25 links  will  be traced  along in  order to
locate about a dozen final  concepts, each of which claims the  given
one as an example.  This whole rippling process, tracing 25 linkages,
uses  less than .01  cpu seconds,  in compiled Interlisp,  on a KI-10
type PDP-10.} locate all  concepts which  claim X  as one of  their
examples.

It  turns out  that {\AM}  cannot simply  call for  X.Isa, then  the Isa
facets of those concepts, etc., because Isa is not \footnote{transitive.}{If  x
isa y,  and y  isa z,  then x  is (generally) {\it NOT} a  z. This is  due to  the
intransitivity of  ``member-of.''  Generalization is transitive, on the
other hand, because ``subset-of'' is transitive.}  For the interested
reader, the algorithm  {\AM} uses to collect Isa's of  X is
given \footnote{below:}{For  the {\sl very} interested reader,
it  is explained in great detail
in the permanently-archived file RIPPLE[dis,dbl] at SAIL. Copies are available
from the author.}

\listbegin
\numlistentry[1]{{\it All  generalizations  of the  given  concept X  are located.    {\AM}
accesses X.Genl, then the Genl facets of {\bf those} concepts, etc.}}
\numlistentry[2]{{\it The ``Isa'' facet of each of those concepts is accessed.}}
\numlistentry[3]{{\it {\AM}  locates all generalizations of  these newly-found higher-level
concepts.  This is the  list of all known  concepts which claim X  as
one of their examples.}}
\listend

\noindent In functional notation, one might express this rippling recipe more compactly as:
$Isas(x) \ \  =↓{df} \ \  Genl↑*(Isa(Genl↑*(x)))$.  
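
This recipe is easily sketched in Python (a modern illustration, not {\AM}'s Interlisp; the GENL and ISA tables are a toy fragment of the concept network, with hypothetical link values chosen to match the Union$\circ$Union example above):

```python
# A minimal sketch of AM's rippling recipe:
#   Isas(x) = Genl*( Isa( Genl*(x) ) )

GENL = {  # concept -> immediate generalizations (the Genl facet)
    "Union∘Union":       [],
    "Compose-with-Self": ["Composition"],
    "Composition":       ["Operation"],
    "Operation":         ["Active"],
    "Active":            ["Any-concept"],
    "Any-concept":       ["Anything"],
    "Anything":          [],
}

ISA = {  # concept -> concepts explicitly claiming it as an example (the Isa facet)
    "Union∘Union": ["Compose-with-Self"],
}

def genl_star(concepts):
    """Reflexive-transitive closure under the Genl links."""
    found, frontier = set(concepts), list(concepts)
    while frontier:
        for g in GENL.get(frontier.pop(), []):
            if g not in found:
                found.add(g)
                frontier.append(g)
    return found

def isas(x):
    """All concepts claiming x as an example: Genl*(Isa(Genl*(x)))."""
    ups = genl_star([x])                            # step 1
    direct = {i for c in ups for i in ISA.get(c, [])}  # step 2
    return genl_star(direct)                        # step 3

print(sorted(isas("Union∘Union")))
```

Run on this fragment, `isas` recovers exactly the six concepts listed earlier as claiming Union$\circ$Union as an example, even though only a single Isa link is stored explicitly.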

\subsectionbegin[5.3]{Ordering the Relevant Heuristics}
Now that all these  relevant heuristics have been assembled,  in what
order should {\AM} execute \ffootnote{them?}{The discussion below assumes that the
heuristics don't interact  with each other; \ie,  that each one  may
act independently of all others; that is, they are not `strongly coupled'.
The validity of this simplification
is   tested  empirically  (see  Chapter  6)  and  discussed
theoretically (see Chapter  7) later.}  It is important  to
note that  the heuristics tacked  onto very general  concepts will be
applicable frequently, yet will not  be very powerful.  For  example,
here is a  typical heuristic rule  which is tacked onto  the Examples
facet of the very general concept Any-concept:

\han3{{\it If the current task is to fill in examples of any concept X,}}

\han3{{\it Then one  way to get them is  to symbolically
\footnote{{\it instantiate}}{``Symbolic
instantiation'' is a euphemism for  a bag of tricks which transform  a
declarative  definition   of  a  concept  into   particular  entities
satisfying that definition. The only constraint on the tricks is that
they not actually {\sl run} the  definition.  One such trick  might be:
if  the  definition  is  recursive,  merely  find  some  entity  that
satisfies the base  step. {\AM}'s symbolic instantiation
tricks are too hand-crafted to be of great
interest, hence  this will  not be  covered any  more deeply  here.   
The interested reader is directed to 
the pioneering work by
[Lombardi & Raphael 64], or the more recent literature on these techniques
applied to automatic program verification (\eg, [Moore 75],
[Balzer 78]).} a definition of X.}}

\yskip

\noindent It takes a tremendous amount of inference to squeeze a couple of awkward
examples of  Intersect$\circ$Intersect  out  of that  concept's  definition.
Much time could  be wasted doing \footnote{so.}{Incidentally, this illustrates
why  no   single  heuristic  should  be  allowed  to  monopolize  the
processing of any one task.}\par
Just as  general heuristics  are  weak but  often relevant,  specific
heuristics are powerful but rarely relevant.  Consider this heuristic
rule,   which   is   attached   to   the   very   specific    concept
Compose-with-Self:


\han3{{\it If the current task is to fill in examples of the composition F$\circ$F,}}

\han3{{\it Then include any fixed-points of F.}}

\yskip

\noindent For  example,  since Intersect($\phi$,X)  equals  $\phi$,  so  must
Intersect$\circ$Intersect($\phi$,X,Y\footnote{).}{$\phi$ (read ``phi'')
is another  name for
the empty set, also written $\{\}$.  This last sentence thus  says that
since $\{\}\inter X  = \{\}$, then $(\{\} \inter X)\inter Y$ must 
also equal $\{\}$.} Assuming that such examples exist already on Intersect.Examples,
this heuristic will fill in  a few examples of  Intersect$\circ$Intersect with
essentially  no  processing  required.    Of  course  the  domain  of
applicability of this heuristic is minuscule.
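
The effect of this rule can be shown concretely (Python rather than {\AM}'s Interlisp, with the concept machinery elided, leaving just the fixed-point observation):

```python
# Sketch of the Compose-with-Self rule above: a fixed point of F yields
# an example of F∘F with essentially no processing.

def intersect(x, y):
    return x & y

def compose_with_self(f):
    """F∘F(x, y, z) =df F(F(x, y), z)."""
    return lambda x, y, z: f(f(x, y), z)

empty = frozenset()                      # phi, the empty set
X, Y = frozenset({1, 2}), frozenset({2, 3})

# empty is a fixed point of Intersect: Intersect(empty, X) == empty,
# hence (empty ∩ X) ∩ Y == empty -- a free example of Intersect∘Intersect.
ii = compose_with_self(intersect)
print(ii(empty, X, Y))                   # frozenset()
```

The heuristic thus converts each known example of Intersect whose value is $\phi$ directly into an example of Intersect$\circ$Intersect, with no evaluation of the composed definition.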

As we expected, the  narrower its domain  of applicability, the  more
powerful and efficient a  heuristic is, and the less  frequently it's
useful.   Thus  in any  given situation, where  {\AM} has  gathered many
heuristic rules,  it  will  probably  be best  to  execute  the  most
specific ones first, and execute the most general ones last.

Below are summarized  the three main points that make  up {\AM}'s scheme
for  finding relevant  heuristics in a  ``natural'' way  and then using
them:

\listbegin
\bullistentry{Each 
heuristic is tacked  onto the most general concept  for which
it  applies:  it is  given  as large  a  domain  of applicability  as
possible. This  will maximize  its generality,  but leave  its  power
untouched.   This  brings it  closer to  the  ``ideal'' tradeoff  point
between these two quantities.}

\bullistentry{When the current task deals with concept C, {\AM} ripples away from C
and quickly locates all the concepts of which C is an example.   Each
of them will contain heuristics relevant to dealing with C.}

\bullistentry{{\AM}  then   applies  those  heuristics  in   order  of  increasing
generality.   You   may  wonder  how  {\AM}  orders  the  heuristics  by
generality. It  turns  out that  the  rippling process  automatically
gathers heuristics  in order of  increasing generality.   In the LISP
system, each rule is therefore executed as soon as it's found.  So {\AM}
never  wastes  time  gathering heuristics  it  won't  have  time  to
execute.}
\listend
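
The third point can be sketched as follows (hypothetical tables; the real system interleaves this with the Genl/Isa rippling of the previous subsection, rather than walking a single chain):

```python
# Sketch: because rippling visits concepts from most specific to most
# general, executing each concept's heuristics as soon as it is reached
# automatically yields the desired specific-before-general order.

GENL = {  # concept -> immediate generalizations (toy fragment)
    "Compose-with-Self": ["Composition"],
    "Composition": ["Operation"],
    "Operation": ["Active"],
    "Active": ["Any-concept"],
    "Any-concept": ["Anything"],
    "Anything": [],
}

HEURISTICS = {  # concept -> heuristics on the relevant facet (illustrative names)
    "Compose-with-Self": ["include-fixed-points"],
    "Operation": ["check-range-narrowing"],
    "Any-concept": ["instantiate-definition"],
}

def heuristics_in_order(concept):
    """Gather (or execute) heuristics in order of increasing generality."""
    order, seen, frontier = [], {concept}, [concept]
    while frontier:
        c = frontier.pop(0)              # breadth-first: specific concepts first
        order.extend(HEURISTICS.get(c, []))
        for g in GENL.get(c, []):
            if g not in seen:
                seen.add(g)
                frontier.append(g)
    return order

print(heuristics_in_order("Compose-with-Self"))
# the powerful fixed-point trick comes first; the weak but general
# definition-instantiation heuristic comes last
```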

\sectionskip
\sectionbegin[6]{{\AM}'s Starting Heuristics}
This section will briefly characterize the collection of
242 heuristic rules which {\AM} was originally given.
A complete listing of those rules is found in Appendix 2;
the rule numbers below refer to the numbering given in that appendix.

\subsectionbegin[6.1]{Heuristics Grouped by the Knowledge They Embody}
Many heuristics embody the belief that mathematics is an empirical inquiry.
That is, one approach to discovery is simply to perform experiments,
observe the results, thereby gather statistically significant amounts of data,
induce from that data some new conjectures or new concepts worth isolating,
and then repeat this whole process again.
Some of the rules which capture this spirit are numbers
21, 43--57, 91, 136--139, 146--148, 153--154, 212--216, 225, and 241.
As one might expect, most of these are ``Suggest'' type rules.
They indicate plausible moves for {\AM} to make, promising new tasks to
try, new concepts worth studying.
Almost all the rest are ``Fillin'' type rules, providing empirical methods
to find entries for a specified facet.

Another large set of heuristics is used to embody---or to modify---what
can be called ``focus of attention.'' When should {\AM} keep on the same track,
and when not? The first
rules expressing varying nuances of this idea are numbers
1--5. The last such rules are numbers 209--216.
Some of these rules are akin to goal-setting mechanisms (\eg, rule
141).
In addition, many of the ``Interest'' type rules have some relation to keeping
{\AM} interested in recently-chosen concepts (or: in concepts related to them,
\eg by Analogy, by Genl/Spec, by Isa/Exs,\ellipsis).

The remaining ``Interest'' rules are generally some re-echoing of the following
notion:  X is interesting if F(X) has an unexpected (interesting) value.
For example, in rule
26,
``F(X)'' is just ``Generalizations of X.''
In slightly more detail, the principal characteristics of 
interestingness are: 

\listbegin
\bullistentry{{\it symmetry (\eg, in an expanding analogy)}}
\bullistentry{{\it coincidence (\eg, in a concept being re-discovered often)}}
\bullistentry{{\it appropriateness (\eg, in choosing an operation H so that
G$\circ$H will have  nicer Domain/Range characteristics than G itself did)}}
\bullistentry{{\it recency (see the previous paragraph on focus of attention)}}
\bullistentry{{\it individuality (\eg, the first entity observed which
satisfies some property)}}
\bullistentry{{\it usefulness (\eg, there are many conjectures involving it)}}
\bullistentry{{\it association (\ie, the given concept is related to an
interesting one)}}
\listend

One group of heuristic rules embeds syntactic tricks for ge\-ne\-ra\-liz\-ing
definitions (Lisp predicates), specializing them, instantiating them,
symbolically evalu\-at\-ing them, inverting them, rudimentarily analyzing them,
etc. For example, see rules 
31 and 89.
Some rules serve other syntactic functions, like ensuring that
various limits aren't exceeded (\eg, rule 15),
that the format for each facet is adhered
to (\eg, rule 16),
that the entries on each facet are used as they are meant to be
(\eg, rules 9 and 59), etc.
Many of the ``Check'' type heuristics fall into this category.

Finally, {\AM} possesses a mass of miscellaneous  rules which evade categorization.
See, \eg,  rules 185 and 236. These range from genuine math
heuristics (rules which lead to discovery frequently) to simple data management
hacks.


\subsectionbegin[6.2]{Heuristics Grouped by How Specific They Are}
Another dimension of distribution of heuristics, aside from the above
{\sl functional}
one, is simply that of how high up in the Genl/Spec tree they are located.
The table below summarizes how the rules were distributed in that tree:

\vskip 2.5in

Here is a key to the column headings:

\parindent 0pt \bf 
 
LEVEL: {\it  How far down the Genl/Spec tree of concepts we are looking.}

\eg: {\it  A sample concept at that level.}

$\#$ Con's: {\it  The total number of concepts at that level.}

$\#$ w/Heur: {\it  How many of them have {\sl some} heuristics.}

$\#$ Heurs: {\it  The total number of heuristics attached to concepts at that level.}

Avg: {\it   The mean number of heuristics per concept, at that level.}

Avg w/Heur: {\it  ($\#$ Heurs) / ($\#$ w/Heur) }

$\#$ Fillin: {\it  Total number of ``Fillin'' type heuristics at that level.}

$\#$ Sugg: {\it  Total number of ``Suggest'' type heuristics at that level.}

$\#$ Check: {\it  Total number of ``Check'' type heuristics at that level.}

$\#$ Int: {\it  Total number of ``Interestingness'' type heuristics at that level.}

\yyskip

\tenpoint\rm

The heuristic rules are seen {\sl not} to be distributed uniformly, homogeneously
among all the initial concepts. The extent of this skewing was not realized
by the author until the above table was constructed.
A surprising proportion of rules are attached to the very general concepts.
The top 10\%\ of the concepts contain 73\%\ of all the heuristics.
One notable exception is the ``Interest'' type heuristics: they seem more evenly
distributed throughout the tree of initial concepts.
This tends to suggest that future work on providing ``meta-heuristics'' should
concentrate on how to automatically synthesize those Interest heuristics for
newly-created concepts.

\worldend